Many biological and man-made networked systems are characterized by the simultaneous presence of 
different sub-networks organized in separate layers, with links and nodes of qualitatively different types. 
While during the past few years theoretical studies have examined a variety of structural features of complex 
networks, the outstanding question is whether such features characterize every single layer, or rather 
emerge as a result of coarse-graining, i.e. when going from the multilayered to the aggregate network 
representation. Here we address this issue with the help of real data. We analyze the structural properties of 
an intrinsically multilayered real network, the European Air Transportation Multiplex Network in which 
each commercial airline defines a network layer. We examine how several structural measures evolve as 
layers are progressively merged together. In particular, we discuss how the topology of each layer affects the 
emergence of structural properties in the aggregate network. 

In the past fifteen years, network theory 1–3 has successfully characterized the interaction among the constituents 
of a variety of complex systems 4,5 , ranging from biological 6 to technological 7 and social 8 systems. However, up 
until recently, attention was almost exclusively given to networks in which all components were treated on 
an equivalent footing, neglecting all the extra information about the temporal or context-related properties of 
the interactions under study. Only in the last three years, taking advantage of the enhanced resolution of real data 
sets, have network scientists directed their attention to the multiplex character of real-world systems, and 
explicitly considered the time-varying 9–14 and multi-layered 15–26 nature of networks. 

A paradigmatic example of an intrinsically multiplex system is the Air Transportation Network 
(ATN). ATNs have grown very significantly during the last decades, giving rise to the dense and 
redundant system we know today 27 . In the ATN, nodes represent airports, while links stand for direct flights 
between two airports. In addition, each commercial airline corresponds to a different layer, containing all 
the connections operated by the same company. While considerable effort has recently been devoted to the 
characterization of the structural properties 28–30 of ATNs and their role in the dynamical processes taking place on 
them 31–34 , their multiplex nature has remained almost unexplored. 

When studying systems that can be represented as a graph made of diverse relationships (layers) between its 
constituents, an important question, typical of complex systems analysis, arises: can the topological properties of 
the whole system be traced to those of its layers or do they emerge from the simultaneous presence of multiple 
layers? Emergence is said to happen when the focus is switched from one scale to a coarser level of description. 
This question can be addressed by comparing the most usual structural properties of the multiple layers com- 
posing a network 35 and their analogue in the aggregate representation of the network, in which the layer structure 
is disregarded. 

To address the above question we resort to the European ATN data set. Taking advantage of the high 
resolution of these data, which comprise a number of airlines (layers) operating in Europe during the year 2011, 
we extract the multiplex character of the system and investigate how the structural properties 
usually observed in the ATN emerge as a result of progressive layer merging. To this end, we quantify 
various topological measures, such as the degree distribution, the clustering coefficient or the presence of a 
rich-club effect, in networks obtained by merging together a growing number of layers, from the level of 
a single layer up to the fully aggregate network. In addition, we compare two different types of 
layers, those corresponding to major (national) airlines and those 
labeled as low-cost companies. We analyze their structural differences 
and their different contributions to the properties of the global 
ATN. 

Results 

The European ATN can be represented as a graph composed of M = 
37 different layers, each representing a different European airline (see 
Methods for details). Each layer m has the same number of nodes, N, 
as all European airports are represented in each layer. Furthermore, 
the data set allows us to extract two main subsets, comprising all major 
and all low-cost airlines, with 18 and 10 layers respectively (see Fig. 1). 
In Fig. 1, panels (a) and (b) display the structure of the aggregate 
network, focusing first on its redundancy, by sketching those links 
belonging to more than one layer, and then on its uniqueness, by reporting 
those links that only exist in a specific layer. Panels (c) and (d) show, 
instead, the single-layer ATN corresponding to a given major and a given 
low-cost airline, respectively. In each of the panels we highlight 
the nodes with the highest number of connections. 

Topological measures. To characterize the structural properties of 
both the aggregate ATN and its layers, we consider several features 
widely used in the network literature 35 , i.e. the cumulative degree 
distribution P>(k), the clustering coefficient C, the size of the giant component 
S, the average path length L and the Rich-club coefficient R. We briefly 
describe below the specific meaning of each of these measures in 
our context. The interested reader will find a complete description 
of all these quantities in the Methods section. 

• The cumulative degree distribution P>(k) gives the probability of 
finding a node with a number of connections (or degree) equal to or 
greater than k. The degree distribution is a powerful tool for 
understanding both structural and dynamical characteristics 
of a system such as, for instance, its tolerance to attacks or failures 36,37 , 
so it represents a cornerstone in the characterization of 
critical infrastructures such as the ATN. 

• The average path length L measures the average number of 
hops one has to make to go from one node to another. In the context 
of ATNs, it indicates the average number of flights a passenger has 
to take to go from his/her origin to his/her destination. However, 
if the system is not connected, this quantity diverges and it is 
preferable to restrict attention to the giant (largest) component 
of the system (see below). 

• The clustering coefficient C measures the probability, C ∈ [0, 1], 
that two nodes with a common neighbor are connected to each other. 
C is a typical measure in systems made of social acquaintances 8 , 
but in our case it is useful to estimate the density of triangular 
motifs (denoting the possibility of performing round trips of 
length 3). 

• The size of the giant component 38 S, denotes the largest fraction of 
overall nodes such that any pair of them is connected through a 
path of finite length. In our case, it estimates the largest coverage 
that a given airline (or a combination of them) provides in terms 
of the available destinations that a passenger can reach from an 
origin inside the giant component. 

• The Rich-club coefficient 39 R measures the tendency of highly 
connected nodes, i.e. the hubs, to be connected among themselves. 
To measure it, one has to compute the abundance of links, 
φ(k), among nodes with a number of connections equal to or greater 
than a certain value k, and the maximum possible number of links 
among those nodes, φ(k)_max . Then, the ratio between these two 
quantities gives the relative abundance of links among nodes with 
at least k connections. Finally, R(k) is given by the ratio between 
the abundance of links in the real case, φ(k)/φ(k)_max , and the same 
quantity calculated in a proper randomized version of the original 
network. Colizza et al. 28 measured R for the ATN, and found that 
the world air transportation network indeed displays a Rich-club 
effect, i.e. for large values of k the value of R(k) is larger than 1. 
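
The first four measures listed above can be illustrated with a minimal, standard-library Python sketch. It is not the code used for the paper: the function names and the representation of a network as a dictionary mapping each node to its set of neighbors are assumptions made here for illustration, as is the convention that nodes with fewer than two neighbors contribute zero clustering. Following the text, the average path length is restricted to the giant component.

```python
from collections import deque


def build_graph(nodes, edges):
    """Undirected simple graph as {node: set_of_neighbours} (hypothetical helper)."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def cumulative_degree_distribution(adj):
    """Return {k: P>(k)}, the fraction of nodes with degree >= k."""
    n = len(adj)
    degrees = [len(nb) for nb in adj.values()]
    return {k: sum(d >= k for d in degrees) / n
            for k in range(max(degrees) + 1)}


def average_clustering(adj):
    """Mean over all nodes of c_i = 2 e_i / (k_i (k_i - 1)); c_i = 0 if k_i < 2."""
    total = 0.0
    for i, nb in adj.items():
        k = len(nb)
        if k < 2:
            continue
        # e_i: links among the neighbours of i (each is counted twice, then halved)
        e = sum(len(adj[u] & nb) for u in nb) / 2
        total += 2 * e / (k * (k - 1))
    return total / len(adj)


def components(adj):
    """List of connected components (sets of nodes), found by breadth-first search."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u] - comp:
                comp.add(v)
                queue.append(v)
        seen |= comp
        comps.append(comp)
    return comps


def giant_component_size(adj):
    """S: fraction of all nodes that belong to the largest connected component."""
    return max(len(c) for c in components(adj)) / len(adj)


def average_path_length(adj):
    """L restricted to the giant component (one BFS per node of that component)."""
    giant = max(components(adj), key=len)
    if len(giant) < 2:
        return 0.0
    total = 0
    for s in giant:
        dist, queue = {s: 0}, deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    n = len(giant)
    return total / (n * (n - 1))
```

For a toy network made of a triangle A-B-C, a pendant node D attached to C and an isolated node E, `build_graph("ABCDE", [("A","B"), ("B","C"), ("C","A"), ("C","D")])` can be passed to each function in turn; the isolated node lowers S and C but does not enter L.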

Emergence of topological properties of the European ATN. We 
now analyze the evolution of the above measures as more and more 
layers are merged (independently of whether they correspond to 
major or low-cost companies), until the complete aggregate ATN, 
comprising all the available layers, is reached (see the Methods 
section for the details on the layer merging procedure). The results 
are shown in Fig. 2. 

In panel (a) we show the evolution of the cumulative degree 
distribution for the aggregate ATN and for the networks obtained by 
merging 1, 5 and 20 randomly chosen layers. Since right-skewed 
distributions often display high noise levels at the end of their tails 
due to the lack of statistics, it is convenient to consider the cumulative 
distribution instead of the distribution P(k) itself 5 . A power-law 
behavior P>(k) ∝ k^(-α) is observed in all the situations considered, 
with a decrease in the exponent α, ranging from α = 1.84 in the single 
layer case (m = 1) to α = 1.39 for the aggregate ATN. The increase in 
heterogeneity with the number of layers considered points to a 
rich-get-richer phenomenon different from the one seen in classical 
models for growing scale-free networks: while in the latter case 
it results from the addition of new nodes, in the present case it 
emerges from the addition of new layers. 
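
As a hedged aside that is not part of the original analysis, the exponent α of the cumulative distribution can be related to the exponent of P(k) itself. The sketch below assumes a continuous-degree approximation and a pure power law; the symbol γ for the exponent of P(k) is introduced here for illustration only.

```latex
P_>(k) \propto k^{-\alpha}
\;\Longrightarrow\;
P(k) \simeq -\frac{\mathrm{d}P_>(k)}{\mathrm{d}k} \propto k^{-(\alpha+1)} \equiv k^{-\gamma},
\qquad \gamma = \alpha + 1 .
% single layer (m = 1):  \alpha = 1.84 \Rightarrow \gamma \approx 2.84
% aggregate ATN:         \alpha = 1.39 \Rightarrow \gamma \approx 2.39
```

Under these assumptions, the smaller exponent of the aggregate network corresponds to a more slowly decaying tail, i.e. a larger share of highly connected airports, which is the sense in which heterogeneity increases with the number of merged layers.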

In panel (b) we report the clustering coefficient. In this case, we 
show the behavior of ⟨C⟩ as a function of the number of layers m used to 
construct the aggregate ATN, averaged over the different 
combinations of m layers (m = 1, …, M). Interestingly, we see 
how the clustering suddenly increases as we merge just a few layers: 




Figure 1 | Visual representation of the ATN. From left to right: the aggregate network of all the layers, in which only links belonging to more 
than one layer are displayed; the same network, in which we display only those links that belong to a single layer and connect at least one node with 
degree greater than or equal to 75; an example of the network of a major airline; and, finally, the network of a low-fare (low-cost) airline. In each 
network, the airports with the highest degree are highlighted. 
Figure 2 | Evolution of topological properties of the complete ATN network. (a) Average cumulative degree distribution P>(k) for groups of layers 
merged together: single layers (●), five layers (■), twenty layers (♦) and the aggregate (▲). (b-c-d) Average clustering ⟨C⟩, size of giant component ⟨S⟩, 
path length ⟨L⟩ as a function of m. (e) Link abundance for nodes of degree k or greater, φ(k), divided by its maximum φ(k)_max for the aggregate network in 
both the real case (■) and its randomized version (●). The vertical dashed line represents the value of k at which the difference between the two curves is 
maximal. (f) A subset of the aggregate network showing the connections among those nodes whose degree is greater than (or equal to) 47. The size of the 
nodes is proportional to the degree. 



to achieve more than 80% of the final clustering value, we only need 
to randomly merge together five layers. This result indicates that the 
large density of triangles present in the ATN is a consequence of the 
merging of different layers rather than a single-layer property. Thus, 
in order to make round trips of length 3 one should, most of 
the time, make use of more than one airline. 

The former result contrasts with the picture obtained for the 
evolution of the size of the giant component ⟨S⟩. Panel (c) shows 
a monotonic and progressive increase of the coverage as more 
layers are aggregated. In fact, around 40% of the European cities 
are covered when merging together five randomly chosen layers. It 
is worth noticing that ⟨S⟩ also tells us that we are considering a system 
which is already above the percolation threshold, so that every step 
towards the aggregate network produces an increment in the collection 
of reachable destinations (see the value of ⟨S⟩ for m = 1). 
However, the average path length 
⟨L⟩ (restricted to those nodes in the giant component) in panel (d) 
shows a rise-and-fall behavior, indicating that combining a few layers 
results in the merging of unconnected components at the aggregate 
level, causing a fast increase in ⟨L⟩. On the other hand, after the 
maximum of ⟨L⟩ is reached, the addition of new layers has a twofold 
effect on the giant component: it incorporates new nodes, but also 
creates alternative links between already present nodes. Thus, the 
average path length of the giant component balances the addition 
of new destinations with the creation of new links, and undergoes a slow 
decrease when increasing m. 

Finally, panel (e) shows, only for the aggregate network, the existence 
of a Rich-club effect by quantifying the abundance of links between 
nodes with degree larger than or equal to k, φ(k), normalized with respect 
to its maximum. This quantity is computed both for the real ATN 
and for a set of randomized versions of the network in which all the 
links are rewired keeping the same degree sequence as the original 
network. This randomization aims at destroying any kind of correlation 
between the local properties of connected nodes. From the figure 
it is clear that initially the two curves coincide, indicating that the 
existence of flights between airports with few connections (fewer than k 
= 30) is equally probable in the ATN and in its randomized version. 
Instead, for k ∈ [30, 60] the points corresponding to the real ATN stand 
above those corresponding to the randomized network. This result 
indicates that the aggregate ATN displays a Rich-club effect (the 
largest effect being found for k = 47), thus confirming for the 
European case the findings of Colizza et al. 28 for the ATN. The 
existence of such an effect is quite logical, as highly connected 
nodes usually correspond to the principal airports of the main European 
cities which, in most cases, are connected among themselves 
via direct flights. Finally, for k > 60 the fluctuations of the randomized 
case are too large for any statement to be made on the existence of a 
Rich-club effect. 
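
The comparison between the real and the randomized link abundance described above can be sketched in standard-library Python. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the network is assumed to be given as a list of undirected links without duplicates or self-loops, and the randomization is a plain double-edge swap that rejects moves creating self-loops or multiple links. The default numbers of swaps and of randomized copies are chosen here for speed and are much smaller than the values reported in Methods.

```python
import random


def degrees(edges):
    """Degree of every node appearing in a simple, undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def link_abundance_ratio(edges, k):
    """phi(k) / phi(k)_max among nodes of degree >= k (0.0 if fewer than 2 such nodes)."""
    deg = degrees(edges)
    rich = {v for v, d in deg.items() if d >= k}
    n_rich = len(rich)
    if n_rich < 2:
        return 0.0
    phi = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * phi / (n_rich * (n_rich - 1))


def rewire(edges, n_swaps, rng=random):
    """Degree-preserving randomization by repeated double-edge swaps.

    Two links (a, b) and (c, d) are replaced by (a, d) and (c, b); swaps that
    would create a self-loop or a duplicate link are rejected.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:  # also allow the other orientation of the swap
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def rich_club(edges, k, n_random=100, n_swaps=None, seed=0):
    """R(k): real phi(k)/phi(k)_max divided by its mean over randomized copies."""
    rng = random.Random(seed)
    n_swaps = n_swaps or 10 * len(edges)
    real = link_abundance_ratio(edges, k)
    rand = [link_abundance_ratio(rewire(edges, n_swaps, rng), k)
            for _ in range(n_random)]
    mean_rand = sum(rand) / len(rand)
    return real / mean_rand if mean_rand > 0 else float("nan")
```

Because the swaps preserve every node's degree, the set of nodes with degree at least k is the same in the real and the randomized networks, so the two curves differ only through the number of links inside that set.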

Major versus low-cost layers. The European ATN is composed of 
layers corresponding to airlines of different types. In particular, we 
find among them major (national, such as Lufthansa), low-cost 
(such as Easyjet), regional (such as Norwegian Air Shuttle) and cargo 
(such as FedEx) airlines. These kinds of airlines have developed 
according to different structural/commercial constraints. For 
instance, it is known that major airlines are designed following the 
so-called hub and spoke structure, to provide an almost complete 
coverage of the airports belonging to a given country 40,41 and 
maximize efficiency in terms of national transportation interests. 
Low-cost companies, instead, tend to avoid overly centralized 
structures and, to be more competitive, typically cover more than 
one country simultaneously. To unveil the role that each type of 
airline plays in the emergence of the topological features of the 
aggregate ATN, we considered two subsets of layers, comprising 
only major and only low-cost airlines, respectively. The results of this 
study are shown in Fig. 3. 

We first address the cumulative degree distribution P>(k). In 
panels (a) and (b) we show the distributions P>(k) for major (a) 
and low-cost (b) layers at different levels of 
merging of layers of the same kind. For major airlines, the typical 
Figure 3 | Evolution of topological properties of major (■) and low-cost (▲) subsets. (a-b) Average cumulative degree distribution P>(k) for 
different numbers of layers merged together. (c-d-e-f) Average clustering ⟨C⟩, number of triangles ⟨n_3⟩, size of giant component ⟨S⟩, path length 
⟨L⟩ as a function of the number of layers merged. The insets display the same quantities in the case of the complete set. (g-h) Link abundance for the 
aggregate network. The vertical dashed line represents the value of k at which the difference between the two curves is maximal. 



trend of a single layer (m = 1) displays a plateau for moderate values 
of k, indicating the centralized character of this kind of layer, with a few 
hubs having remarkably higher than average connectivity. In addition, 
when merging more layers (m = 10 or all the major airlines) the 
trend shows a rather continuous decay due to the combination of 
hubs of different sizes (depending on the nation of the airline). Notice 
that a hub of a single layer (a single national airline) is highly con- 
nected within the same country, but also has some flights to capitals 
of other European countries which, in turn, are hubs of their corres- 
ponding major layers. On the other hand, the cumulative distri- 
bution of typical low-cost airlines shows a rather different pattern, 
as its decay is rather progressive, and airports of different size coexist 
within the same layer. 

The differences in organization of low-cost and major airlines are 
further highlighted by the behavior of the clustering coefficient ⟨C⟩. 
Panel (c) shows how major airlines display sharp increases in ⟨C⟩ as 
more major layers are merged, followed by a plateau for m > 5. This 
saturation of ⟨C⟩ is due to the fact that, when merging major layers 
randomly, national hubs tend to connect together (we have already 
discussed this fact when introducing the Rich-club effect) in the 
aggregate network. The saturation of clustering is, however, not 
observed for the aggregate ATN [see Fig. 2(b) or the inset in panel 
(c)], for which ⟨C⟩(m) always increases. This is due to the fact that the 
merging of low-cost layers leads to a continuous formation of new 
triangles, thus increasing the clustering with m. In addition, in panel 
(d) we show the evolution with m of the average number of triangles, 
⟨n_3⟩, normalized with respect to the total number of triads in the 
aggregate network, for both major and low-cost layers. Interestingly, 
the monotonic growth of ⟨n_3⟩ reveals that the saturation of the clustering 
coefficient around m = 5 for major layers is not due to the fact 
that no new triangles are added when m > 5, but to a balance 
between the new triads and the new connections added when merging 
additional layers. 
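
The balance between triangles and triads invoked above can be made concrete with a small counting sketch in standard-library Python. It is only an illustration: the function names are hypothetical, the network is assumed to be a list of undirected links without duplicates, and the ratio of triangles to connected triples used here is the global transitivity, which is related to, but not the same as, the average local clustering ⟨C⟩ reported in the figures.

```python
def adjacency(edges):
    """Map each node to the set of its neighbours (simple undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(edges):
    """Number of distinct triangles n_3."""
    adj = adjacency(edges)
    # every triangle is found once from each of its three links
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def count_triads(edges):
    """Number of connected triples (paths of length 2), counted at their centre node."""
    adj = adjacency(edges)
    return sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())


def transitivity(edges):
    """3 * n_3 / (number of connected triples); 0.0 if there are no triples."""
    triads = count_triads(edges)
    return 3 * count_triangles(edges) / triads if triads else 0.0
```

For instance, a single triangle A-B-C has one triangle and three triads. Attaching a second triangle C-D-E to node C doubles the number of triangles but raises the number of triads to ten, so the ratio drops even though triangles keep being added.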

The behavior of the giant component ⟨S⟩, normalized with respect 
to the total number of destinations covered by each kind of airline 
(see panel (e)), does not give any particular insight in terms of differences 
between low-cost and major airlines, except for the fact that in 
the low-cost case we observe larger fluctuations, mainly due to the 
large variability in size of the giant component of single layers. On the 
other hand, the picture described by the average path length ⟨L⟩ in 
panel (f) is very interesting. Major and low-cost subsets behave rather 
differently, not only from each other but also with respect to the evolution 
of the complete set (see inset). For layers corresponding to major 
companies, ⟨L⟩ increases with the number of merged layers. The 
interpretation of this continuous growth is straightforward: each 
time a layer corresponding to a major airline is added, even if it shares 
some common destinations (say some European capitals having 
their corresponding major airlines within the original set of merged 
layers), the number of new available nodes (small destinations only 
available through the newly added major layer) is large enough to 
generate an increase in ⟨L⟩. On the contrary, the low-cost case displays 
a rise-and-fall in the behavior of ⟨L⟩, due to the large coverage 
of European countries/cities that each single low-cost layer already 
provides. Thus, once we merge a few of them together, they already 
cover nearly all the low-cost destinations, and merging additional 
layers just adds new connections between them. When combined 
into the original ATN, these two different trends lead to the saturated 
evolution of ⟨L⟩(m) shown in the inset. 

Finally, we examine once again the onset of the Rich-club effect. 
From panels (g) and (h) we notice how the 
aggregate network constructed by merging layers corresponding 
to major airlines (g) displays the presence of a rich club for k = 38 
(almost the same value as in the case of the total aggregate ATN). 
Interestingly, the Rich-club effect is absent when merging low-cost 
layers. Thus, while in the case of major airlines the merging of layers 
containing large hubs ends up in a system composed of a connected 
core of highly connected nodes, the more distributed nature of the 
low-cost layers prevents the formation of a Rich-club. A relevant 
conclusion is therefore that the well-known 28 Rich-club effect observed 
in ATNs is exclusively related to the presence of major airlines. 

Discussion 

The characterization of the interaction patterns in large systems has 
recently been spurred by the incorporation of the paradigm of multi- 
plexity. Taking advantage of the European ATN data set, with details 
of the airlines operating each flight, we showed that the topological 
properties of the ATN are generally not present in single layers; 
rather, they are the consequence of an emergent phenomenon intimately 
related to the multilayer character of the system. We also 
pointed out that the merging of low-cost and major (national) layers 
leads to the emergence of qualitatively different aggregate networks. 
Finally, we demonstrated that the combination of these two different 
behaviors accounts for many important structural features of the 
global ATN, such as the Rich-club effect (mainly due to the layers of 
major airlines), path redundancy (resulting from a cooperative combination 
of the clustering of low-cost and major layers), or small-worldness 
(remarkably enhanced by the presence of low-cost layers). 

Our study highlights the importance of considering the multiplex 
character of most real networked systems, and shows that consider- 
ing layers as relevant entities of a network (such as nodes and links at 
the micro-scale or communities at the meso-scale) will contribute to 
a better understanding and modeling of dynamical processes taking 
place at the level of the aggregate network. 

Methods 

Dataset. The data analyzed in this paper are taken from the complete list of 
airlines operating Instrument Flight Rules (IFR) flights between European 
airports on a certain day, obtained from EUROCONTROL and the Complex 
World Network in the context of the SESAR Work Package E 42 . We selected only 
those airlines whose number of destinations is above the average (which is 32), 
obtaining M = 37 different airlines (layers), which include both major companies 
(such as Lufthansa or Air France) and low-fare (low-cost) companies (such as Ryanair or 
Easyjet). Each layer $\ell$ in this multiplex representation is a graph 
$G^{\ell} = (\mathcal{N}, \mathcal{E}^{\ell})$ with $|\mathcal{N}| = N = 450$ nodes and $|\mathcal{E}^{\ell}|$ links that models a 
single airline. An example of such networks is shown in Fig. 1. The ensemble of all 
these layers constitutes our multilayer system, which we call the complete set. 
We also consider the subset of major airlines, a multiplex network 
made of 18 layers, and the subset of low-cost companies, with 10 layers. 
Note that the remaining airlines, such as cargo airlines, constitute a small 
marginal subset, so we do not analyze them separately. 

Topological indexes. In this section, we present a summary of the topological 
measures used throughout the paper. Note that the topological measures considered 
are defined for classic monoplex networks; their extension to the 
multiplex setting is straightforward, and the details are given here. 

One of the most basic topological parameters of a complex network $G = (\mathcal{N}, \mathcal{E})$ is the 
degree distribution P(k), which is defined as the probability that a node chosen uniformly 
at random has degree k or, equivalently, the fraction of nodes in the network 
having degree k 3,35 . Since broad distributions often display high noise levels at the end 
of their tails, here related to the low abundance of highly connected nodes, it is 
convenient to consider the cumulative distribution $P_>(k)$, i.e. the probability 
that a randomly chosen node has a degree equal to or greater than k:

$$P_>(k) = \sum_{k' \ge k} P(k') = \frac{1}{N} \sum_{k' \ge k} N(k'), \tag{1}$$

where N(k) is the number of nodes with degree k and $N = |\mathcal{N}|$ is the total number of 
nodes in the network. 

The average path length 4 L(G) is the average length of the shortest paths among all 
the pairs of nodes in the network, i.e. 

$$L(G) = \frac{1}{N(N-1)} \sum_{i \ne j} d_{ij}, \tag{2}$$

where $d_{ij}$ is the minimum number of hops one has to make to go from node i to node j 
in G (the distance from i to j). Note that this definition diverges if G is not connected, 
since $d_{ij}$ may be infinite. One way to avoid this divergence is to consider the average 
only over the largest connected component; an alternative approach that has 
proved very useful in many cases is to consider the harmonic mean of the distances. 

The (local) clustering coefficient 4 $c_i$ of a node $i \in \mathcal{N}$ is defined as 

$$c_i = \frac{2 e_i}{k_i (k_i - 1)}, \tag{3}$$

where $e_i$ is the number of pairs of neighbors of i that are themselves neighbors, and $k_i$ is the 
degree of node i. Therefore the (local) clustering coefficient of a node i is the ratio 
between the number of links among the neighbors of i and the maximal 
possible number of edges between neighbors of i. The (average) clustering coefficient 
C of a graph is the arithmetic mean of $c_i$ over all its nodes. 

The giant component of G is its largest connected component, and the size of 
the giant component, S(G), is the proportion of nodes in the network that belong to the giant 
component, i.e., 

$$S(G) = \max_{i \in \mathcal{N}} \frac{N_i}{N}, \tag{4}$$

where $N_i$ is the number of nodes of the maximal connected subnetwork of G containing 
node i. 

For a degree $0 < k < |\mathcal{N}|$, the Rich-club coefficient R(k) 39 is given by 

$$R(k) = \frac{\phi(k)/\phi(k)_{\max}}{\phi'(k)/\phi'(k)_{\max}} = \frac{2\phi(k)}{N_{\ge k}(N_{\ge k}-1)} \cdot \frac{N_{\ge k}(N_{\ge k}-1)}{2\phi'(k)} = \frac{\phi(k)}{\phi'(k)}, \tag{5}$$

where 

(i) $\phi(k)$ is the number of edges connecting nodes of degree greater than or equal to k 
(called the link abundance), 

(ii) $\phi(k)_{\max}$ is the maximum number of links that can exist between nodes of degree 
greater than or equal to k, i.e. $N_{\ge k}(N_{\ge k}-1)/2$, 

(iii) $\phi'(k)$ is the link abundance on a network with the same degree sequence as the 
original but with connections randomly shuffled, 

(iv) $\phi'(k)_{\max}$ is the maximum number of links that can exist between nodes of 
degree greater than or equal to k on a network with the same degree sequence as the original but with 
connections randomly shuffled, 

(v) $N_{\ge k}$ is the number of nodes with degree greater than or equal to k. 

Since the randomization preserves the degree sequence, $N_{\ge k}$ is the same in both networks, 
which is why the normalization cancels in Eq. (5). 
If R(k) > 1 for some $0 < k < N = |\mathcal{N}|$, then we say that G has 
a Rich-club. Note that in the plots presented in this paper, we decided to present the 
ratios $\phi(k)/\phi(k)_{\max}$ and $\phi'(k)/\phi'(k)_{\max}$ instead of R(k). The randomization, in our 
case, is repeated 1,000 times, while the shuffling is repeated 10,000 times to ensure a 
robust statistical sampling. Note that for the ATN network, having a size of N = 450 
nodes, the number of random shuffling steps is large enough to guarantee that the 
resulting network is fully randomized. This randomization method is known as the 
Markov Chain Monte Carlo Algorithm 43 . However, for bigger graphs other methods 
are recommended so as to minimize the computational cost of producing reliable 
randomized networks; see the work by Del Genio et al. 44 . 

Next, we describe the layer merging procedure used to study the evolution of the 
topological measures and the behavior of the layers in the major airline and low-cost 
multiplex sub-networks. 

If we fix a subset of layers $\{G^{\ell} ; \ell = \ell_1, \dots, \ell_m\}$ to merge together, we construct a 
monoplex network $G' = (\mathcal{N}, \mathcal{E}')$ (i.e. a classic network with only one layer) given by 

$$G' = \bigcup_{i=1}^{m} G^{\ell_i}.$$

This network G' is obtained by projecting all the m layers onto one and by 
converting multiple links into single ones. 

Now, if we fix m, we look for all the possible mergings of m layers. The number of 
different configurations to arrange n layers into groups of size m without repetitions is 
given by $C^n_m = \binom{n}{m}$; therefore, if we want to compute a topological measure on the 
ensemble of m layers, we should first compute it on each of the mergings, and then 
average over all $C^n_m$ possible configurations. However, when the number of possible 
configurations exceeded a certain threshold, we performed a random sampling over 
500,000 mergings in order to limit the computation time. Throughout 
the paper the operator $\langle \cdot \rangle$ denotes the average over the elements of the ensemble. As 
an example, if we want to compute the clustering coefficient over an ensemble $\mathcal{C}$, we 
compute: 

$$\langle C \rangle = \frac{1}{N_{\mathrm{conf}}} \sum_{i \in \mathcal{C}} C_i,$$

where $N_{\mathrm{conf}}$ is the number of elements of $\mathcal{C}$ and $C_i$ is the average clustering of the 
network obtained by merging together the layers corresponding to $i \in \mathcal{C}$. 
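
The merging and ensemble-averaging procedure can be sketched in standard-library Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, each layer is assumed to be a collection of undirected links given as node pairs, and, since the text does not state the threshold above which sampling replaces full enumeration, the sketch assumes it equals the sample size of 500,000. Sampled configurations are drawn independently, so repetitions are possible.

```python
import random
from itertools import combinations
from math import comb


def merge_layers(layers):
    """Project several layers onto one network, collapsing multiple links.

    Each layer is an iterable of undirected links (u, v). The result is a set
    of frozensets, so a link present in several layers is kept only once.
    """
    merged = set()
    for layer in layers:
        merged |= {frozenset(e) for e in layer}
    return merged


def ensemble_average(layers, m, measure, max_configs=500_000, seed=0):
    """Average of `measure` over mergings of m layers.

    All C(n, m) configurations are enumerated when there are at most
    `max_configs` of them (assumed threshold); otherwise `max_configs`
    configurations are sampled uniformly at random.
    """
    n = len(layers)
    if comb(n, m) <= max_configs:
        configs = combinations(range(n), m)
    else:
        rng = random.Random(seed)
        configs = (rng.sample(range(n), m) for _ in range(max_configs))
    total, count = 0.0, 0
    for config in configs:
        total += measure(merge_layers(layers[i] for i in config))
        count += 1
    return total / count
```

Here `measure` is any function that takes the merged set of links and returns a number, for example `len` for the number of distinct links, or a wrapper around a clustering routine.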